547 research outputs found

    When Provenance Aids and Complicates Reproducibility Judgments

    Full text link
    It is well-established that the provenance of a scientific result is important, sometimes more important than the actual result. For computational analyses that involve visualization, this provenance information may contain the steps involved in generating visualizations from raw data. Specifically, data provenance tracks the lineage of data and process provenance tracks the steps executed. In this paper, we argue that the utility of computational provenance may not be as clear-cut as we might like. One common use case for provenance is that the information can be used to reproduce the original result. However, in visualization, the goal is often to communicate results to a user or viewer, and thus the insights obtained are ultimately most important. Viewers can miss important changes or react to unimportant ones. Here, interaction provenance, which tracks a user's actions with a visualization, or insight provenance, which tracks the decision-making process, can help capture what happened but don't remove the issues. In this paper, we present scenarios where provenance impacts reproducibility in different ways. We also explore how provenance and visualizations can be better related

    Doctor of Philosophy

    Get PDF
    dissertationServing as a record of what happened during a scientific process, often computational, provenance has become an important piece of computing. The importance of archiving not only data and results but also the lineage of these entities has led to a variety of systems that capture provenance as well as models and schemas for this information. Despite significant work focused on obtaining and modeling provenance, there has been little work on managing and using this information. Using the provenance from past work, it is possible to mine common computational structure or determine differences between executions. Such information can be used to suggest possible completions for partial workflows, summarize a set of approaches, or extend past work in new directions. These applications require infrastructure to support efficient queries and accessible reuse. In order to support knowledge discovery and reuse from provenance information, the management of those data is important. One component of provenance is the specification of the computations; workflows provide structured abstractions of code and are commonly used for complex tasks. Using change-based provenance, it is possible to store large numbers of similar workflows compactly. This storage also allows efficient computation of differences between specifications. However, querying for specific structure across a large collection of workflows is difficult because comparing graphs depends on computing subgraph isomorphism which is NP-Complete. Graph indexing methods identify features that help distinguish graphs of a collection to filter results for a subgraph containment query and reduce the number of subgraph isomorphism computations. For provenance, this work extends these methods to work for more exploratory queries and collections with significant overlap. However, comparing workflow or provenance graphs may not require exact equality; a match between two graphs may allow paired nodes to be similar yet not equivalent. This work presents techniques to better correlate graphs to help summarize collections. Using this infrastructure, provenance can be reused so that users can learn from their own and others' history. Just as textual search has been augmented with suggested completions based on past or common queries, provenance can be used to suggest how computations can be completed or which steps might connect to a given subworkflow. In addition, provenance can help further science by accelerating publication and reuse. By incorporating provenance into publications, authors can more easily integrate their results, and readers can more easily verify and repeat results. However, reusing past computations requires maintaining stronger associations with any input data and underlying code as well as providing paths for migrating old work to new hardware or algorithms. This work presents a framework for maintaining data and code as well as supporting upgrades for workflow computations

    Provenance for computational tasks: a survey

    Get PDF
    Journal ArticleThe problem of systematically capturing and managing provenance for computational tasks has recently received significant attention because of its relevance to a wide range of domains and applications. The authors give an overview of important concepts related to provenance management, so that potential users can make informed decisions when selecting or designing a provenance solution

    VisComplete: automating suggestions for visualization pipelines

    Get PDF
    Journal ArticleBuilding visualization and analysis pipelines is a large hurdle in the adoption of visualization and workflow systems by domain scientists. In this paper, we propose techniques to help users construct pipelines by consensus-automatically suggesting completions based on a database of previously created pipelines. In particular, we compute correspondences between existing pipeline subgraphs from the database, and use these to predict sets of likely pipeline additions to a given partial pipeline. By presenting these predictions in a carefully designed interface, users can create visualizations and other data products more efficiently because they can augment their normal work patterns with the suggested completions. We present an implementation of our technique in a publicly-available, open-source scientific workflow system and demonstrate efficiency gains in real-world situations

    Functions that are the Directed X-Ray of a Planar Convex Body

    Get PDF
    We characterize functions that are the directed X-ray of a planar convex body from a source that is a positive distance from the body. In addition to a concavity condition the necessary and sufficient conditions involve the structure of points of zero curvature and a priori estimates for derivatives of the directed X-ray near supporting rays and points of zero curvature. The techniques employed also lead to explicit methods for constructing families of planar convex bodies with a common directed X-ray

    Toward Systematic Design Considerations of Organizing Multiple Views

    Full text link
    Multiple-view visualization (MV) has been used for visual analytics in various fields (e.g., bioinformatics, cybersecurity, and intelligence analysis). Because each view encodes data from a particular perspective, analysts often use a set of views laid out in 2D space to link and synthesize information. The difficulty of this process is impacted by the spatial organization of these views. For instance, connecting information from views far from each other can be more challenging than neighboring ones. However, most visual analysis tools currently either fix the positions of the views or completely delegate this organization of views to users (who must manually drag and move views). This either limits user involvement in managing the layout of MV or is overly flexible without much guidance. Then, a key design challenge in MV layout is determining the factors in a spatial organization that impact understanding. To address this, we review a set of MV-based systems and identify considerations for MV layout rooted in two key concerns: perception, which considers how users perceive view relationships, and content, which considers the relationships in the data. We show how these allow us to study and analyze the design of MV layout systematically.Comment: Short paper with 4 pages + 1 reference page, 2 figures, 1 table, accepted at IEEE VIS 2022 conferenc
    • …
    corecore